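A membership inference attack (MIA) poses privacy risks to the training data of a machine learning model. With an MIA, an attacker guesses whether a target data point is a member of the training dataset. The state-of-the-art defense against MIAs, distillation for membership privacy (DMP), requires not only the private data to be protected but also a large amount of unlabeled public data. However, in certain privacy-sensitive domains, such as medicine and finance, the availability of public data is not guaranteed. Moreover, the trivial approach of generating the public data with generative adversarial networks significantly decreases model accuracy, as reported by the authors of DMP. To overcome this problem, we propose a novel defense against MIAs that uses knowledge distillation without requiring public data. Our experiments show that the privacy protection and accuracy of our defense are comparable to those of DMP on the benchmark tabular datasets used in MIA research, and that on the image dataset CIFAR10 our defense achieves a much better privacy-utility trade-off than existing defenses that do not use public data.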
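To make the attack setting concrete, here is a minimal sketch of one common baseline MIA, a confidence-threshold attack. It is not the defense proposed in the abstract, and the function names, the threshold, and the confidence values are hypothetical and chosen only for illustration. It assumes the attacker can observe the model's confidence on the true label of each candidate record.

```python
def confidence_threshold_attack(confidences, threshold):
    """Guess 'member' (True) when the model's confidence on the true
    label exceeds the threshold; models tend to be more confident on
    records they were trained on."""
    return [c > threshold for c in confidences]


def attack_accuracy(guesses, is_member):
    """Fraction of records whose membership was guessed correctly."""
    correct = sum(g == m for g, m in zip(guesses, is_member))
    return correct / len(is_member)


# Hypothetical confidences: the first four records are training members,
# the last four are non-members.
confidences = [0.99, 0.97, 0.97, 0.60, 0.90, 0.55, 0.40, 0.98]
is_member = [True, True, True, True, False, False, False, False]

guesses = confidence_threshold_attack(confidences, threshold=0.95)
accuracy = attack_accuracy(guesses, is_member)
print(accuracy)
```

An accuracy noticeably above 0.5 on a balanced set of members and non-members indicates membership leakage; a defense aims to push this number back toward 0.5 without hurting model accuracy.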
This study proposes novel control methods that lower impact force by preemptive movement and smoothly transition to conventional contact impedance control. These suggested techniques are for force control-based robots and position/velocity control-based robots, respectively. Strong impact forces have a negative influence on multiple robotic tasks. Recently, preemptive impact reduction techniques that expand conventional contact impedance control by using proximity sensors have been examined. However, a seamless transition from impact reduction to contact impedance control has not yet been accomplished. The proposed methods utilize a serial combined impedance control framework to solve this problem. The preemptive impact reduction feature can be added to the already implemented impedance controller because the parameter design is divided into impact reduction and contact impedance control. There is no undesirable contact force during the transition. Furthermore, even though the preemptive impact reduction employs a crude optical proximity sensor, the influence of reflectance is minimized using a virtual viscous force. Analyses and real-world experiments confirm these benefits.
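For readers unfamiliar with impedance control, the sketch below shows the textbook one-dimensional impedance law, in which the commanded force behaves like a virtual spring and damper. It is only an illustration of the conventional contact impedance control that the abstract builds on, not the paper's serial combined framework; the function name and the gain values are assumptions.

```python
def impedance_force(x_des, x, v_des, v, stiffness, damping):
    """Textbook 1-D impedance law: a virtual spring pulls the end
    effector toward x_des and a virtual damper resists the velocity
    error."""
    return stiffness * (x_des - x) + damping * (v_des - v)


# Hypothetical gains and states (SI units): 2 cm position error while
# still moving toward the target at 5 cm/s.
force = impedance_force(
    x_des=0.10, x=0.08, v_des=0.0, v=0.05,
    stiffness=100.0, damping=10.0,
)
print(force)  # approximately 1.5 N
```

The damping term is the part that acts like a viscous force: it opposes motion regardless of position, which is why damping-like terms are useful for slowing an approach before contact.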
Principal Component Analysis (PCA) and its exponential family extensions have three components: observations, latents and parameters of a linear transformation. We consider a generalised setting where the canonical parameters of the exponential family are a nonlinear transformation of the latents. We show explicit relationships between particular neural network architectures and the corresponding statistical models. We find that deep equilibrium models -- a recently introduced class of implicit neural networks -- solve maximum a-posteriori (MAP) estimates for the latents and parameters of the transformation. Our analysis provides a systematic way to relate activation functions, dropout, and layer structure, to statistical assumptions about the observations, thus providing foundational principles for unsupervised DEQs. For hierarchical latents, individual neurons can be interpreted as nodes in a deep graphical model. Our DEQ feature maps are end-to-end differentiable, enabling fine-tuning for downstream tasks.
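As a rough illustration of what "implicit" means here, the following sketch computes the output of a scalar deep equilibrium layer by fixed-point iteration: the output z* is defined as the solution of z = f(z, x) rather than by a fixed stack of layers. The map f(z, x) = tanh(w z + u x + b), the parameter values, and the function name are assumptions made for this example and are not taken from the paper; choosing |w| < 1 makes the map a contraction, so plain iteration converges.

```python
import math


def deq_forward(x, w=0.5, u=1.0, b=0.0, tol=1e-10, max_iter=1000):
    """Solve z = tanh(w*z + u*x + b) by fixed-point iteration.

    Because tanh is 1-Lipschitz, |w| < 1 makes the update a contraction,
    so the iteration converges to a unique equilibrium.
    """
    z = 0.0
    for _ in range(max_iter):
        z_next = math.tanh(w * z + u * x + b)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


z_star = deq_forward(0.3)
# The residual should be close to zero at the equilibrium.
residual = abs(z_star - math.tanh(0.5 * z_star + 0.3))
print(z_star, residual)
```

Practical DEQ implementations typically use faster root solvers and implicit differentiation for the backward pass; simple iteration is used here only because it is the shortest correct way to show the idea.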
Humans demonstrate a variety of interesting behavioral characteristics when performing tasks, such as selecting between seemingly equivalent optimal actions, performing recovery actions when deviating from the optimal trajectory, or moderating actions in response to sensed risks. However, imitation learning, which attempts to teach robots to perform these same tasks from observations of human demonstrations, often fails to capture such behavior. Specifically, commonly used learning algorithms embody inherent contradictions between the learning assumptions (e.g., single optimal action) and actual human behavior (e.g., multiple optimal actions), thereby limiting robot generalizability, applicability, and demonstration feasibility. To address this, this paper proposes designing imitation learning algorithms with a focus on utilizing human behavioral characteristics, thereby embodying principles for capturing and exploiting actual demonstrator behavioral characteristics. This paper presents the first imitation learning framework, Bayesian Disturbance Injection (BDI), that typifies human behavioral characteristics by incorporating model flexibility, robustification, and risk sensitivity. Bayesian inference is used to learn flexible non-parametric multi-action policies, while simultaneously robustifying policies by injecting risk-sensitive disturbances to induce human recovery action and ensuring demonstration feasibility. Our method is evaluated through risk-sensitive simulations and real-robot experiments (e.g., table-sweep task, shaft-reach task and shaft-insertion task) using the UR5e 6-DOF robotic arm, to demonstrate the improved characterisation of behavior. Results show significant improvement in task performance, through improved flexibility, robustness as well as demonstration feasibility.
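Text-to-image models have recently achieved remarkable success, producing seemingly accurate samples of photorealistic quality. However, just as state-of-the-art language models still struggle to evaluate precise statements consistently, so do image generation processes based on language models. In this work, we showcase the problems that state-of-the-art text-to-image models such as DALL-E have in generating accurate samples from statements related to the DrawBench benchmark. Furthermore, we show that CLIP is unable to rerank these generated samples consistently. To this end, we propose LogicRank, a neuro-symbolic reasoning framework that can provide a more accurate ranking system for such precision-demanding settings. LogicRank integrates smoothly into the generation process of text-to-image models and can also be used to fine-tune models toward greater logical precision.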
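We consider the continuum-armed bandit problem in the setting of recommending the best arms within a fixed budget under aggregated feedback. This is motivated by applications where precise rewards are impossible or expensive to obtain, while an aggregated reward or feedback, such as the average over a subset, is available. We constrain the set of reward functions by assuming that they are drawn from a Gaussian process, and we propose the Gaussian Process Optimistic Optimisation (GPOO) algorithm. We adaptively construct a tree whose nodes are subsets of the arm space, where the feedback is the aggregated reward of the representatives of a node. We propose a new notion of simple regret with respect to aggregated feedback on the recommended arms. We provide a theoretical analysis of the proposed algorithm and recover single-point feedback as a special case. We illustrate GPOO and compare it with related algorithms on simulated data.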
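The notion of aggregated feedback can be illustrated with a small sketch. It assumes a one-dimensional arm space, a noise-free reward function, and evenly spaced representative points inside a node; these choices, along with the function name, are hypothetical and are not the GPOO algorithm itself.

```python
def aggregated_reward(reward_fn, lo, hi, n_points):
    """Average reward over n_points evenly spaced representatives
    (cell midpoints) of the node [lo, hi]. The learner sees only this
    average, never the individual rewards."""
    width = (hi - lo) / n_points
    points = [lo + (i + 0.5) * width for i in range(n_points)]
    return sum(reward_fn(p) for p in points) / n_points


# Hypothetical reward function on the arm space [0, 1].
feedback = aggregated_reward(lambda x: x, lo=0.0, hi=1.0, n_points=4)
print(feedback)
```

Setting `n_points=1` queries a single representative of the node, which corresponds to the single-point feedback mentioned in the abstract as a special case.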
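We develop a quantitative theory of the escape problem for the stochastic gradient descent (SGD) algorithm and investigate the effect of the sharpness of loss surfaces on escape. Deep learning has achieved tremendous success in various domains, but it has also raised a number of open theoretical questions. One typical question is why SGD can find parameters that generalize well over non-convex loss surfaces. The escape problem is one approach to this question: it investigates how efficiently SGD escapes from local minima. In this paper, we develop a quasi-potential theory for the escape problem by applying the theory of stochastic dynamical systems. We show that the quasi-potential theory can handle both the geometric properties of loss surfaces and the covariance structure of gradient noise in a unified manner, whereas previous works studied them separately. Our theoretical results imply that (i) the sharpness of the loss surface contributes to slow escape of SGD, and (ii) the noise structure of SGD cancels this effect and exponentially accelerates escape. We also empirically validate our theory through experiments with neural networks trained on real data.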
Scenarios requiring humans to choose from multiple seemingly optimal actions are commonplace; however, standard imitation learning often fails to capture this behavior. Instead, an over-reliance on replicating expert actions induces inflexible and unstable policies, leading to poor generalizability in applications. To address this problem, this paper presents the first imitation learning framework that incorporates Bayesian variational inference for learning flexible non-parametric multi-action policies, while simultaneously robustifying the policies against sources of error by introducing and optimizing disturbances to create a richer demonstration dataset. This combined approach forces the policy to adapt to challenging situations, enabling stable multi-action policies to be learned efficiently. The effectiveness of the proposed method is evaluated through simulations and real-robot experiments for a table-sweep task using the UR3 6-DOF robotic arm. Results show that, through improved flexibility and robustness, the learning performance and control safety are better than those of the comparison methods.